An efficient implementation of the forward-backward least-mean-square
(FBLMS) adaptive line enhancer is presented in this article. Without changing the
characteristics of the FBLMS adaptive line enhancer, the proposed implementation
technique reduces multiplications by 25 percent and additions by 12.5 percent over two
successive time samples, relative to a direct implementation, in both the prediction and
weight control sections. The proposed FBLMS architecture
and algorithm can be applied to digital receivers to enhance the signal-to-noise ratio
and allow fast carrier acquisition and tracking in both stationary and nonstationary
environments.


I. Introduction 

Adaptive line enhancers (ALEs) are useful in many areas, including time-domain spectral estimation 
for fast carrier acquisition [2-4]. For example, a fast carrier acquisition technique [2] (see footnote 1), as shown in Fig. 1,
will be very useful for a deep-space mission, especially in a nonstationary environment or emergencies. 
Figure 1 is the block diagram of an ALE in a digital receiver used for both acquisition and tracking. First, 
the receiver is in the acquisition mode. Second, when the uplink carrier is acquired as indicated by the lock 
detector, the switch is shifted to the tracking position and the tracking process takes over immediately. 
With this acquisition scheme, the uplink carrier can be acquired by a transponder in seconds (as opposed 
to minutes for the Cassini transponder). Although devised to support a space mission, the architecture of 
the forward-backward least-mean-square (FBLMS) ALE and the associated algorithm proposed in this 
article are also applicable to other systems, including fixed-ground and mobile communication systems. 
Note that this proposed ALE scheme in the receiver needs a residual carrier, and does not work directly 
in suppressed-carrier cases. 

A conventional ALE system using a least-mean-square (LMS) algorithm is depicted in Fig. 2, where
$z^{-1}$ represents a delay. The analysis of the ALE for enhancing the signal-to-noise ratio (SNR) to allow
fast acquisition is given in [2]. The block diagram of an FBLMS adaptive line enhancer is shown in Fig. 3.
The performance analysis of the FBLMS adaptive line enhancer is provided in [1]. The FBLMS adaptive
line enhancer algorithm achieves approximately half the misadjustment of the LMS algorithm [1].

1 T. M. Nguyen, H. G. Yeh, and L. V. Lam, “A New Carrier Frequency Acquisition Technique for Future Digital Transponders,”
to be published in a future issue of The Telecommunications and Data Acquisition Progress Report.
$$e_f(n) = x(n) - X^T(n)\,W(n) \tag{1a}$$

$$e_b(n) = x(n-N) - X_b^T(n)\,W(n) \tag{1b}$$

where the superscript $T$ denotes the transpose of a vector, and

$$X^T(n) = [x(n-1),\ x(n-2),\ \ldots,\ x(n-N)] \tag{1c}$$

$$X_b^T(n) = [x(n-N+1),\ x(n-N+2),\ \ldots,\ x(n)] \tag{1d}$$

$$W^T(n) = [w_1(n),\ w_2(n),\ \ldots,\ w_N(n)] \tag{1e}$$

In any gradient algorithm, the coefficient vector W(n) is updated using 

$$W(n+1) = W(n) - \mu\,\hat{\nabla}\{e(n)^2\} \tag{2a}$$

where $\mu$ is the adaptive step size and $\hat{\nabla}\{e(n)^2\}$ is the estimated gradient of the surface of $E\{e(n)^2\}$.
Note that $E\{\cdot\}$ denotes the expected value. In the forward-backward algorithm, $e(n)^2 = e_f(n)^2 + e_b(n)^2$,
and the gradient estimate is chosen as

$$\hat{\nabla}\{e(n)^2\} = -[e_f(n)X(n) + e_b(n)X_b(n)] \tag{2b}$$

It is shown in [1] that Eq. (2b) is an unbiased estimator of the gradient. This leads to the coefficient
update

$$W(n+1) = W(n) + \mu[e_f(n)X(n) + e_b(n)X_b(n)] \tag{2c}$$

This means that $W(n+1) = W(n)$ in steady state when both the forward and backward errors approach zero.
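As a minimal sketch of Eqs. (1a) through (2c), the following Python routine applies the direct (non-fast) FBLMS update one sample at a time. The function names `fblms_step` and `run_fblms`, the zero initial weights, and the noise-free test sinusoid are illustrative assumptions and do not come from the article.

```python
import math


def fblms_step(w, x, n, mu):
    """Apply one direct FBLMS update at time index n (requires n >= len(w))."""
    N = len(w)
    X = [x[n - i] for i in range(1, N + 1)]        # Eq. (1c): x(n-1), ..., x(n-N)
    Xb = [x[n - N + i] for i in range(1, N + 1)]   # Eq. (1d): x(n-N+1), ..., x(n)
    e_f = x[n] - sum(a * b for a, b in zip(X, w))        # Eq. (1a)
    e_b = x[n - N] - sum(a * b for a, b in zip(Xb, w))   # Eq. (1b)
    # Eq. (2c): step along the combined forward and backward gradient estimate.
    w_new = [wi + mu * (e_f * xf + e_b * xb) for wi, xf, xb in zip(w, X, Xb)]
    return w_new, e_f, e_b


def run_fblms(x, N, mu):
    """Run the update over all usable samples; weights start at zero (assumption)."""
    w = [0.0] * N
    forward_errors = []
    for n in range(N, len(x)):
        w, e_f, _ = fblms_step(w, x, n, mu)
        forward_errors.append(e_f)
    return w, forward_errors


# Noise-free sinusoid at normalized frequency 0.1 (illustrative input only).
signal = [math.cos(2 * math.pi * 0.1 * k) for k in range(2000)]
weights, errs = run_fblms(signal, N=6, mu=2 ** -8)
print("forward error power, first 100 samples:", sum(e * e for e in errs[:100]) / 100)
print("forward error power, last 100 samples: ", sum(e * e for e in errs[-100:]) / 100)
```

With a noise-free input both errors can be driven toward zero, so the weights settle, which is the steady-state behavior described above. With additive noise the weights keep jittering around the optimum.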


III. The Fast Forward-Backward LMS Algorithm 

The fast FBLMS algorithm is derived in this section by using the radix-2 algorithm on time samples. 
Both predictor and weight update sections are provided in detail. 

A. Predictor Section 

We consider the computation of two successive predictions in both the forward and backward directions
with the fixed weight vector $W(n-1)$. After regrouping even and odd terms, the forward predictor
is obtained [5] and given in Eq. (3):

$$\begin{bmatrix} d_f(n-1) \\ d_f(n) \end{bmatrix} = \begin{bmatrix} X^T(n-1) \\ X^T(n) \end{bmatrix} W(n-1) = \begin{bmatrix} A^T & B^T \\ C^T & A^T \end{bmatrix} \begin{bmatrix} W_0 \\ W_1 \end{bmatrix} \tag{3a}$$

where

$$A^T = [x(n-2),\ x(n-4),\ \ldots,\ x(n-N+2),\ x(n-N)] \tag{3b}$$

$$B^T = [x(n-3),\ x(n-5),\ \ldots,\ x(n-N+1),\ x(n-N-1)] \tag{3c}$$

$$C^T = [x(n-1),\ x(n-3),\ \ldots,\ x(n-N+3),\ x(n-N+1)] \tag{3d}$$

$$W_0 = [w_0(n-1),\ w_2(n-1),\ \ldots,\ w_{N-2}(n-1)]^T \tag{3e}$$

$$W_1 = [w_1(n-1),\ w_3(n-1),\ \ldots,\ w_{N-1}(n-1)]^T \tag{3f}$$

Similarly, the backward predictor is obtained and given as follows:

$$\begin{bmatrix} d_b(n-1) \\ d_b(n) \end{bmatrix} = \begin{bmatrix} X_b^T(n-1) \\ X_b^T(n) \end{bmatrix} W(n-1) = \begin{bmatrix} F^T & G^T \\ G^T & H^T \end{bmatrix} \begin{bmatrix} W_0 \\ W_1 \end{bmatrix} \tag{4a}$$

where

$$F^T = [x(n-N),\ x(n-N+2),\ \ldots,\ x(n-4),\ x(n-2)] \tag{4b}$$

$$G^T = [x(n-N+1),\ x(n-N+3),\ \ldots,\ x(n-3),\ x(n-1)] \tag{4c}$$

$$H^T = [x(n-N+2),\ x(n-N+4),\ \ldots,\ x(n-2),\ x(n)] \tag{4d}$$


Equations (3a) and (4a) are approximations by virtue of updating the weight vector only once every two 
cycles. The relationship between the two sequence sets {A,B,C} and {F,G,H} is given as follows: 


$$F = A_r \tag{5}$$

$$G = C_r \tag{6}$$

$$z^{-1}H = A_r \tag{7}$$

where the subscript $r$ denotes the reversed order of the sequence, and $z^{-1}$ denotes one delay unit of the
corresponding sequence, which is equivalent to a delay of two time samples. Furthermore, we observe the following
relationships between G, B, and C:

$$z^{-1}G = B_r \tag{8}$$

$$z^{-1}C = B \tag{9}$$
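The relationships in Eqs. (5) through (9) can be checked numerically. The sketch below builds the six sequences from their definitions and models one delay unit $z^{-1}$ as evaluating a sequence at time n - 2. The helper names `sequences` and `check_relations` and the list-based indexing are assumptions made for illustration.

```python
def sequences(x, n, N):
    """Return A, B, C, F, G, H at time n per Eqs. (3b)-(3d) and (4b)-(4d); N even."""
    A = [x[n - k] for k in range(2, N + 1, 2)]      # x(n-2), x(n-4), ..., x(n-N)
    B = [x[n - k] for k in range(3, N + 2, 2)]      # x(n-3), x(n-5), ..., x(n-N-1)
    C = [x[n - k] for k in range(1, N, 2)]          # x(n-1), x(n-3), ..., x(n-N+1)
    F = [x[n - N + k] for k in range(0, N, 2)]      # x(n-N), x(n-N+2), ..., x(n-2)
    G = [x[n - N + k] for k in range(1, N, 2)]      # x(n-N+1), ..., x(n-1)
    H = [x[n - N + k] for k in range(2, N + 1, 2)]  # x(n-N+2), ..., x(n)
    return A, B, C, F, G, H


def check_relations(x, n, N):
    """Check Eqs. (5)-(9); one delay unit z^-1 corresponds to two time samples."""
    A, B, C, F, G, H = sequences(x, n, N)
    _, _, C_d, _, G_d, H_d = sequences(x, n - 2, N)  # sequences delayed by one unit
    return (
        F == A[::-1]          # Eq. (5)
        and G == C[::-1]      # Eq. (6)
        and H_d == A[::-1]    # Eq. (7)
        and G_d == B[::-1]    # Eq. (8)
        and C_d == B          # Eq. (9)
    )


# Distinct sample values, so any index mismatch would be detected.
x = list(range(100, 140))
print(check_relations(x, n=20, N=6))
```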


After performing the appropriate computation, Eq. (4a) can be rewritten as follows: 


$$\begin{bmatrix} d_b(n-1) \\ d_b(n) \end{bmatrix} = \begin{bmatrix} G^T(W_0 + W_1) + (F-G)^T W_0 \\ G^T(W_0 + W_1) - (G-H)^T W_1 \end{bmatrix} \tag{10}$$


The computation of Eq. (4a) requires two inner products of length N, while that of Eq. (10) requires
only three inner products of length N/2 and N/2 additions to form $W_0 + W_1$. Similarly, by combining
Eqs. (5) through (9), Eq. (3a) can be rewritten as follows:


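$$\begin{bmatrix} d_f(n-1) \\ d_f(n) \end{bmatrix} = \begin{bmatrix} A^T(W_0 + W_1) + (B-A)^T W_1 \\ A^T(W_0 + W_1) - (A-C)^T W_0 \end{bmatrix} = \begin{bmatrix} A^T(W_0 + W_1) + z^{-1}(G-H)_r^T W_1 \\ A^T(W_0 + W_1) - (F-G)_r^T W_0 \end{bmatrix} \tag{11}$$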


Clearly, the sequences (G - H) and (F - G) of Eq. (10) are reused in Eq. (11), but in reverse order.
The computation of Eq. (11) requires only three inner products of length N/2. The total numbers of
multiplications and additions required in both the forward and backward predictor sections for two successive
computations are about 3N and 3.5N, respectively. By comparison, evaluating Eqs. (1a) and (1b) directly
for two successive predictions requires 4N multiplications and 4(N - 1) additions. Consequently,
there are savings of about 25 percent in multiplications and 12.5 percent in additions.
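The regrouping in Eqs. (10) and (11) can be compared against the direct predictions of Eqs. (3a) and (4a). The sketch below is a hedged illustration: the function names, the random test data, and the batch (non-streaming) construction of the sequences are assumptions. A hardware delay line would obtain (B - A) from the previously computed (G - H) rather than recomputing it.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def forward_vec(x, m, N):
    return [x[m - i] for i in range(1, N + 1)]       # X(m), Eq. (1c)


def backward_vec(x, m, N):
    return [x[m - N + i] for i in range(1, N + 1)]   # X_b(m), Eq. (1d)


def direct_predictions(x, w, n):
    """Eqs. (3a) and (4a) evaluated directly: four inner products of length N."""
    N = len(w)
    return (dot(forward_vec(x, n - 1, N), w), dot(forward_vec(x, n, N), w),
            dot(backward_vec(x, n - 1, N), w), dot(backward_vec(x, n, N), w))


def fast_predictions(x, w, n):
    """Eqs. (10) and (11): six inner products of length N/2 (N must be even)."""
    N = len(w)
    W0, W1 = w[0::2], w[1::2]                  # even- and odd-indexed weights
    Ws = [a + b for a, b in zip(W0, W1)]       # W0 + W1, shared by all four outputs
    A = [x[n - k] for k in range(2, N + 1, 2)]
    B = [x[n - k] for k in range(3, N + 2, 2)]
    C = [x[n - k] for k in range(1, N, 2)]
    H = [x[n - N + k] for k in range(2, N + 1, 2)]
    G = C[::-1]                                # Eq. (6)
    B_A = [b - a for b, a in zip(B, A)]        # B - A
    A_C = [a - c for a, c in zip(A, C)]        # A - C
    F_G = A_C[::-1]                            # F - G is (A - C) reversed, by Eqs. (5), (6)
    G_H = [g - h for g, h in zip(G, H)]        # G - H
    fwd, bwd = dot(A, Ws), dot(G, Ws)
    return (fwd + dot(B_A, W1),                # d_f(n-1), Eq. (11)
            fwd - dot(A_C, W0),                # d_f(n),   Eq. (11)
            bwd + dot(F_G, W0),                # d_b(n-1), Eq. (10)
            bwd - dot(G_H, W1))                # d_b(n),   Eq. (10)


random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(40)]
w = [random.gauss(0.0, 1.0) for _ in range(6)]
direct = direct_predictions(x, w, n=20)
fast = fast_predictions(x, w, n=20)
print("largest difference:", max(abs(d - f) for d, f in zip(direct, fast)))
```

Six inner products of length N/2 cost 3N multiplications, against 4N for the four length-N inner products of the direct form, which is the 25 percent saving quoted above.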

B. Weight Update Section 

We consider the weight coefficient updates now. Since weights are explicitly computed at every other 
time update using the look-ahead approach [6], the weight update of Eq. (2c) can be rewritten as follows: 

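$$\begin{aligned} W(n+1) &= W(n-1) + \mu[e_f(n-1)X(n-1) + e_b(n-1)X_b(n-1)] + \mu[e_f(n)X(n) + e_b(n)X_b(n)] \\ &= W(n-1) + \begin{bmatrix} X(n) & X(n-1) \end{bmatrix} \begin{bmatrix} \mu e_f(n) \\ \mu e_f(n-1) \end{bmatrix} + \begin{bmatrix} X_b(n) & X_b(n-1) \end{bmatrix} \begin{bmatrix} \mu e_b(n) \\ \mu e_b(n-1) \end{bmatrix} \end{aligned} \tag{12}$$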


By combining Eqs. (5) through (9), Eq. (12) is rewritten as follows: 
$$\begin{bmatrix} W_0(n+1) \\ W_1(n+1) \end{bmatrix} = \begin{bmatrix} W_0(n-1) \\ W_1(n-1) \end{bmatrix} + \mu \begin{bmatrix} A(e_f(n) + e_f(n-1)) - (F-G)_r\, e_f(n) \\ A(e_f(n) + e_f(n-1)) + z^{-1}(G-H)_r\, e_f(n-1) \end{bmatrix} + \mu \begin{bmatrix} G(e_b(n) + e_b(n-1)) + (F-G)\, e_b(n-1) \\ G(e_b(n) + e_b(n-1)) - (G-H)\, e_b(n) \end{bmatrix} \tag{13}$$
The vectors (F - G) and (G - H) are once more employed in Eq. (13). Notice that the term
$\mu[A(e_f(n) + e_f(n-1)) + G(e_b(n) + e_b(n-1))]$ is computed only once, and the sum is applied to both $W_0$ and
$W_1$ for updates. The total numbers of multiplications and additions in Eq. (13) are about 3N and 3.5N,
respectively, whereas those of Eq. (2c) for two adaptations
are 4N and 4(N - 1). Consequently, 25 percent of multiplications and 12.5 percent of additions are saved
by using Eq. (13) instead of Eq. (2c).
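The two-step look-ahead update can likewise be checked against its regrouped form. The sketch below accumulates the two gradient terms of Eq. (12) directly and compares the result with an even/odd regrouping derived from Eqs. (5) through (9). The function names, the arbitrary error values, and the random data are illustrative assumptions; the errors are simply taken as given inputs here.

```python
import random


def direct_update(w, x, n, mu, ef_prev, ef_now, eb_prev, eb_now):
    """Eq. (12): add two successive forward-backward gradient terms to W(n-1)."""
    N = len(w)
    X_now = [x[n - i] for i in range(1, N + 1)]
    X_prev = [x[n - 1 - i] for i in range(1, N + 1)]
    Xb_now = [x[n - N + i] for i in range(1, N + 1)]
    Xb_prev = [x[n - 1 - N + i] for i in range(1, N + 1)]
    return [wi + mu * (ef_prev * a + eb_prev * b + ef_now * c + eb_now * d)
            for wi, a, b, c, d in zip(w, X_prev, Xb_prev, X_now, Xb_now)]


def fast_update(w, x, n, mu, ef_prev, ef_now, eb_prev, eb_now):
    """Regrouped update: about 3N multiplications instead of 4N (N must be even)."""
    N = len(w)
    A = [x[n - k] for k in range(2, N + 1, 2)]
    B = [x[n - k] for k in range(3, N + 2, 2)]
    C = [x[n - k] for k in range(1, N, 2)]
    H = [x[n - N + k] for k in range(2, N + 1, 2)]
    G = C[::-1]
    B_A = [b - a for b, a in zip(B, A)]        # z^-1 (G - H), reversed
    A_C = [a - c for a, c in zip(A, C)]        # (F - G), reversed
    F_G = A_C[::-1]
    G_H = [g - h for g, h in zip(G, H)]
    # Scale the four errors by mu once, so no per-tap multiplication by mu is needed.
    mf_prev, mf_now, mb_prev, mb_now = (mu * e for e in (ef_prev, ef_now, eb_prev, eb_now))
    s_f, s_b = mf_now + mf_prev, mb_now + mb_prev
    common = [a * s_f + g * s_b for a, g in zip(A, G)]   # shared by W0 and W1
    W0 = [w0 + c - r * mf_now + q * mb_prev
          for w0, c, r, q in zip(w[0::2], common, A_C, F_G)]
    W1 = [w1 + c + p * mf_prev - q * mb_now
          for w1, c, p, q in zip(w[1::2], common, B_A, G_H)]
    out = [0.0] * N
    out[0::2], out[1::2] = W0, W1              # interleave even and odd weights
    return out


random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(40)]
w = [random.gauss(0.0, 1.0) for _ in range(6)]
args = (w, x, 20, 2 ** -8, 0.3, -0.2, 0.1, 0.4)   # arbitrary error values
print(max(abs(a - b) for a, b in zip(direct_update(*args), fast_update(*args))))
```

The shared term `common` corresponds to the quantity that is computed only once and applied to both weight halves.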


IV. Implementation 

The architecture of the fast FBLMS algorithm is depicted in Fig. 4. A switching circuit is employed
after the adaptive line enhancer, and the switch rate (from C to A or from A to C) is the same as
the sampling rate. The switching circuit alternates between points C and A, so sequences
C and A are each generated at a rate of 1/(2T). The sequence B is a delayed version of the
sequence C. By using a radix-2 structure, sequences {B - A} and {A - C} are then generated at the
upper and lower lag, respectively. By using the sequence {B - A}, inner products $(B-A)^T W_1$ and
$z^{-1}(G-H)^T W_1$ are generated at the upper and lower lag, respectively, of the upper forward-backward
tapped-delay-line structure. Similarly, by using the sequence {A - C}, inner products $(A-C)^T W_0$ and
$(F-G)^T W_0$ are generated at the upper and lower lag, respectively, of the lower forward-backward
tapped-delay-line structure. Note that vectors F, G, and H are defined in Eqs. (5), (6), and (7), respectively.
Inner products $A^T(W_0 + W_1)$ and $G^T(W_0 + W_1)$ are computed at the top and bottom portions,
respectively, of the fast FBLMS architecture. Finally, forward errors {$e_f(n)$ and $e_f(n-1)$} and backward
errors {$e_b(n-1)$ and $e_b(n-2)$} are computed at the right-hand side of Fig. 4. In order to subtract the
term $z^{-1}(G-H)^T W_1$ and form the backward error, a delay unit is applied to the output branch of
the inner product $G^T(W_0 + W_1)$. Consequently, the corresponding backward error is delayed from
$e_b(n)$ to $e_b(n-2)$. Notice that this radix-2 structure concept can be applied again to the upper and lower
forward-backward tapped-delay-line portions of the fast FBLMS algorithm to further reduce the number
of multiplications and additions.

Although the fast FBLMS architecture shown in Fig. 4 appears more complex than the FBLMS 
shown in Fig. 3, the structure is still very simple. In fact, the fast FBLMS architecture consists of 
radix-2, forward LMS, and FBLMS structures. The increased data flow complexity over the FBLMS 
algorithm is limited; therefore, the fast FBLMS algorithm can be easily implemented with digital signal 
processors. 


V. Simulation Results 

An adaptive line enhancer with six weights (N = 6) is chosen as an example. The input signal is a sinusoid
of frequency $f_0$ contaminated by white noise. Computer simulations are conducted to calculate the
misadjustment of the forward LMS, FBLMS, and fast FBLMS algorithms. The misadjustment [1] is
computed after convergence as follows:


$$M = \frac{\text{extra output power due to weight jittering}}{\text{minimum output power}} = \frac{E[\Delta(n)^T \phi(x,x)\,\Delta(n)]}{E[e(n)^2]_{\text{opt}}} \tag{14}$$

where

$$\Delta(n) = W(n) - W_{\text{opt}} \tag{15}$$

$$\phi(x,x) = E[X(n)X^T(n)] \tag{16}$$

$$E[e(n)^2]_{\text{opt}} = E[x(n)^2] - W_{\text{opt}}^T\, E[x(n)X(n)] \tag{17}$$

Fig. 4. The architecture of the fast FBLMS algorithm.
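A small numerical sketch of Eqs. (14) through (17) follows. It replaces the expectation in Eq. (14) with a sample mean over a recorded weight history, which is an assumption about how the quantity might be estimated, and it uses toy values for the correlation matrix, the cross-correlation vector, and the weights. All function and variable names are hypothetical.

```python
def quad_form(delta, phi):
    """Return delta^T * phi * delta for a vector and a square matrix (nested lists)."""
    size = len(delta)
    return sum(delta[i] * phi[i][j] * delta[j]
               for i in range(size) for j in range(size))


def min_error_power(x_power, w_opt, p):
    """Eq. (17): E[e^2]_opt = E[x^2] - W_opt^T E[x(n) X(n)]."""
    return x_power - sum(wi * pi for wi, pi in zip(w_opt, p))


def misadjustment(weight_history, w_opt, phi, e_opt):
    """Eq. (14), with the expectation approximated by a sample mean (assumption)."""
    excess = 0.0
    for w in weight_history:
        delta = [wi - oi for wi, oi in zip(w, w_opt)]   # Eq. (15)
        excess += quad_form(delta, phi)
    return (excess / len(weight_history)) / e_opt


# Toy two-weight example; none of these numbers come from the article.
phi = [[1.0, 0.0], [0.0, 1.0]]            # Eq. (16), assumed known
w_opt = [0.5, 0.25]
e_opt = min_error_power(1.0, w_opt, [1.0, 0.0])
history = [[0.625, 0.25], [0.5, 0.0]]     # weights jittering around w_opt
M = misadjustment(history, w_opt, phi, e_opt)
print("misadjustment (percent):", 100 * M)
```

Multiplying M by 100 gives a percent misadjustment, the unit used in Tables 1 and 2.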


Table 1 shows the measured misadjustments for various values of SNR at step size $\mu = 2^{-8}$. The
excess error power for both the FBLMS and the fast FBLMS algorithms is approximately half that of
the forward LMS algorithm at the 10-dB SNR. The improvement in misadjustment of both the
FBLMS and the fast FBLMS algorithms over the forward LMS algorithm is limited at an SNR
around 0 dB. However, the misadjustment of the fast FBLMS algorithm is about the same as that of
the FBLMS algorithm. Furthermore, it is observed in Table 1 that, at a higher SNR, the misadjustment
increases (for a given step size $\mu = 2^{-8}$). This is because the minimum output error power decreases
much more rapidly than the extra output power due to weight jittering, as depicted by Eq. (14). This
high misadjustment is significantly reduced when the step size $\mu$ is cut to $2^{-10}$, as shown in Table 2.

Table 2 shows the measured misadjustments for various values of the step size and the frequency $f_0$ at
SNR = 10 dB. The excess error power for both the FBLMS and the fast FBLMS algorithms
is approximately half that of the forward LMS algorithm at the step sizes $\mu = 2^{-8}$ and $\mu = 2^{-10}$.
The misadjustment is much reduced when the step size is small ($2^{-10}$) for any one of the three
algorithms. Again, the misadjustment of the fast FBLMS algorithm is about the same as that of the
FBLMS. The $E[e(n)^2]_{\text{opt}}$ used to derive the misadjustment is computed by using 500 samples in each run.
The misadjustment results listed in Tables 1 and 2 were obtained by averaging 100 runs of the excess
error power curves after convergence had been achieved.

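Table 1. A comparison between the misadjustment powers of
three algorithms at $\mu = 2^{-8}$.

| SNR, dB | $f_0$ | Percent misadjustment: Forward LMS | Percent misadjustment: FBLMS | Percent misadjustment: Fast FBLMS |
|---------|-------|------------------------------------|------------------------------|-----------------------------------|
| 0       | 0.1   | 3.04                               | 2.75                         | 2.75                              |
| 3       | 0.1   | 3.74                               | 2.84                         | 2.93                              |
| 10      | 0.1   | 32.50                              | 13.77                        | 16.95                             |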

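Table 2. A comparison between the misadjustment powers of three
algorithms using fixed SNR = 10 dB with different $\mu$.

| $\mu$     | $f_0$  | Percent misadjustment: Forward LMS | Percent misadjustment: FBLMS | Percent misadjustment: Fast FBLMS |
|-----------|--------|------------------------------------|------------------------------|-----------------------------------|
| $2^{-8}$  | 0.1667 | 31.34                              | 14.47                        | 16.03                             |
| $2^{-8}$  | 0.1    | 32.5                               | 13.77                        | 16.95                             |
| $2^{-10}$ | 0.1667 | 3.06                               | 2.05                         | 1.99                              |
| $2^{-10}$ | 0.1    | 2.33                               | 1.24                         | 1.30                              |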
Fig. 5. A typical excess error power versus n plot using the (a) forward LMS, (b) FBLMS, and (c) fast FBLMS
algorithms, and (d) the steady-state comparison.


Figures 5(a), (b), and (c) show a typical excess error power versus n plot at $f_0 = 1/6$, step size $\mu = 2^{-8}$,
and SNR = 10 dB for the forward LMS, FBLMS, and fast FBLMS algorithms, respectively. Figure 5(d)
shows the excess error power at the steady state. It is clear that the performance of the fast FBLMS
algorithm is about the same as that of the FBLMS algorithm.


VI. Conclusion 

The fast forward-backward LMS algorithm presented in this article shows that the number of arithmetic
operations in [1] can be reduced without degrading performance. In the forward-backward predictor
section, 25 percent of multiplications and 12.5 percent of additions are saved in each of two successive
operations. Similarly, in the weight control section, 25 percent of multiplications and 12.5 percent of
additions are saved in each of two adaptations. Simulation results indicate that improvements in
misadjustment for both the FBLMS and the fast FBLMS algorithms over the conventional LMS algorithm are
about 50 percent at a high SNR. When the SNR is low, the misadjustment improvement for both the
FBLMS and the fast FBLMS algorithms over the conventional LMS algorithm is less than 50 percent.
Notice that this fast forward-backward LMS algorithm is well suited for implementation on
application-specific integrated circuits and digital signal processors. This implementation method can be generalized
by using more than two steps of look-ahead. Further computational savings are possible at a limited cost
in controlling the appropriate data flow. This fast FBLMS adaptive line enhancer can be easily integrated
with either a conventional voltage-controlled oscillator in a closed loop for acquisition/tracking,
as used in the present deep-space transponder, or a numerically controlled oscillator in an open-loop
scheme for acquiring and tracking the carrier signal, as will be used in future deep-space transponders.